Abstract

We consider the problem of scheduling a set of n tasks on m processors under precedence, communication,
and global system energy constraints to minimize makespan. We extend existing scheduling
models to account for energy usage and give convex programming algorithms that yield essentially the
same results as existing algorithms that do not consider energy, while adhering to a strict energy bound.

1 Introduction 

We consider the problem of scheduling a set of n tasks on m homogeneous processors under precedence,
communication, and global system energy constraints to minimize makespan. This problem is of particular
importance in portable and embedded systems [21][1][20][4], where the power demands of a growing number
of computationally intensive applications [11][22] outmatch the growth rate of battery energy density [16]. It
is also important in high-performance systems and data centers [31], where the operational costs of powering
and cooling [6][17][9] and related reliability issues from power dissipation [34, 17] are substantial. Because
multi-core processors are the industry's answer to the power and thermal constraints limiting clock speed
[27][8], general scheduling methods that conserve energy and minimize running times are needed to fully
and efficiently use these systems.

The problem of scheduling tasks on processors to minimize the makespan (overall runtime) has a long 
history Q. In the most general form, we are given a number of tasks to assign to processors, such that no 
processor may run more than one task at a time and the goal is to minimize the makespan (latest completion 
time). Distributing the tasks amongst processors typically reduces the makespan; however the problem 
presents several issues which make the solution non-trivial. First, a system has a finite number of processors 
which is typically much less than the number of tasks, implying that some tasks must be allocated to the same
processor (thus potentially delaying their completion). Second, there is communication between tasks in the
form of precedence constraints. When one task requires the output of another, we cannot schedule them to 
run in parallel on different processors. Third, there is communication between processors; the implication 
is that if the output of task i is required by task j which is scheduled to run on a different processor, then 
there will be some additional time delay Cij after the completion of i for the necessary information to arrive. 
Note that this delay can be avoided by scheduling i,j on the same processor. 

We further consider an overall bound on the total energy consumed by all tasks. Our goal is to produce
a smooth tradeoff between the total energy consumption and the makespan, which is achievable by varying
the energy bound. We observe that some previous scheduling models permitted re-computation of tasks
(computing the same task multiple times on different processors to save on communication delays); this is 
energy-wasteful and our models will assume that such re-computation does not occur. Previous work on 
energy often tried to reduce the energy consumption without any reduction to the makespan; this approach 
has some merits but essentially treats energy as a "secondary" objective rather than producing a tradeoff 
between energy and makespan. Our model also extends previous work by allowing the tradeoff between 
running time and energy to be arbitrary (but convex) and task-dependent; this makes sense in the context of 
tasks which heavily load different system components (e.g., processor, network card, memory) and therefore
may behave differently under speed-scaling. 

Problem Definition We formally state the problem as follows. We are given (1) a set J of n tasks; (2) m 
processors; (3) a directed acyclic graph G on n nodes representing precedence constraints between tasks; 
(4) a communication delay $c_{ij}$ for each edge in G; (5) a set of energy/time tradeoff functions $e_j(d_j)$;
and (6) an energy bound E. 

The goal is to construct a schedule $\sigma = \{(j, I, p)\}$ where task j is assigned over time interval I to
processor p, such that (a) if there is an edge from i to j in G, then task i must finish before j can begin; (b)
each processor can only work on one task at any time; (c) $d_j$ is the total time for which task j runs over all
assigned intervals; (d) the total energy used, $\sum_j e_j(d_j)$, is bounded by E; and (e) the time at which the last
task completes (the makespan) is minimized.

When the communication delays are non-zero, we restrict attention to schedules that do not migrate
tasks, and require the additional constraint (f) that if $i \to j$ is an edge in G and i and j are scheduled on
different processors, then j cannot start until $c_{ij}$ time after i has completed.
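The constraints above can be made concrete with a small feasibility checker. The following is a minimal sketch, not part of the original paper: it assumes a non-preemptive schedule stored as one (start, duration, processor) triple per task, and the function name `check_schedule` and its argument layout are hypothetical choices made for illustration.

```python
from collections import defaultdict

def check_schedule(schedule, edges, delays, energy_fns, energy_bound, eps=1e-9):
    """Return (feasible, makespan) for a non-preemptive schedule.

    schedule:   {task: (start, duration, processor)}, one interval per task
    edges:      iterable of (i, j) precedence pairs
    delays:     {(i, j): c_ij}; a missing edge means zero delay
    energy_fns: {task: callable mapping a duration to an energy}
    """
    # (a) and (f): precedence, plus the delay when i and j sit on different processors
    for i, j in edges:
        ti, di, pi = schedule[i]
        tj, _, pj = schedule[j]
        lag = delays.get((i, j), 0.0) if pi != pj else 0.0
        if tj + eps < ti + di + lag:
            return False, None
    # (b): intervals on the same processor must not overlap
    by_proc = defaultdict(list)
    for t, d, p in schedule.values():
        by_proc[p].append((t, t + d))
    for intervals in by_proc.values():
        intervals.sort()
        for (_, end), (start, _) in zip(intervals, intervals[1:]):
            if start + eps < end:
                return False, None
    # (d): total energy must respect the bound E
    total = sum(energy_fns[j](d) for j, (_, d, _) in schedule.items())
    if total > energy_bound + eps:
        return False, None
    # (e): the makespan is the latest completion time
    return True, max(t + d for t, d, _ in schedule.values())
```

For example, with a single edge from task "a" to task "b", a delay of 1, and both tasks of duration 2, placing "b" on a second processor at time 2 is rejected, while placing it there at time 3 is accepted with makespan 5.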

Previous Work Most previous work is experimental and considers the problem in the context of dynamic 
voltage scaling, for which speed (operating frequency) is approximately proportional to the voltage and for 
which power is approximately proportional to voltage cubed [30]. This work includes more recent heuristics
that minimize makespan given a hard energy bound [21][1] as well as heuristics that minimize energy given
hard timing constraints lOTl l33l [TTl |20] |6 1. For both variations, the approach generally taken is to create an 
initial schedule in which tasks are scheduled at the highest speed possible, and then to reduce the speed of 
tasks in such a way that the schedule length does not increase. 

Several heuristic approaches also incorporate mathematical programming. Rountree et al [29] use a
linear program to bound the optimal solution from below. Kang and Ranka [19] and Leung et al [22] use
heuristics to approximate the optimal solutions to integer programs. Zhang et al [35] use a mathematical
program to perform voltage scaling after the tasks have been scheduled. Unlike our methods, these approaches
do not yield schedules with provable guarantees. 

The only previous work that yields schedules with provable guarantees while considering energy is by
Pruhs et al [26]. They develop a speed-scaling processor model for precedence-constrained tasks with three
primary results. The first is a proof that running all processors at a single fixed speed is at best $\Omega(\mathrm{poly}(m))$
approximate. The second is a proof that the total power across all processors is constant over time in any
optimal solution. The third result is an algorithm that is $O(1)$ approximate in energy and $O(\log^{1+2/\alpha} m)$
approximate in the makespan for a model-dependent constant $\alpha$. This algorithm reduces the problem for
m < n processors to the problem of scheduling tasks on m related processors (considered by Chekuri and
Bender in [2]) that run at speeds that are a function of a power level p, which is chosen using a binary search
to give the energy bound. We improve on this result in section 2.2, where we combine convex programming
with the list scheduling result by Graham [12][13] to obtain an algorithm that is $(2 - 1/m)$-approximate in
the makespan and satisfies the energy bound exactly.

Algorithms for precedence-constrained task scheduling that give provable guarantees on makespan (but
without accounting for energy considerations) have been an active area of research since the last millennium.
Graham [12][13] gives a $(2 - 1/m)$-approximation algorithm for the m < n case and Fujii et al [7]
and Coffman and Graham [3] give exact algorithms for the m = 2 case that use preemption and migration.
Papadimitriou and Yannakakis [24, 25] give a 2-approximate algorithm for the m > n, common communication
delay $c_{ij} = r$ case that uses recomputation, and Jung et al [18] give an exact algorithm for this
case that is exponential in fixed integer r. Munier and König [23] give an algorithm for m > n and small
communication delays (where the duration of a task is at least $\rho \ge 1$ times any communication delay) that
is $\beta$-approximate for $\beta = \frac{2 + 2\rho}{2\rho + 1}$, and Hanen and Munier [14, 15] give an algorithm for the m < n case that
is $(1 + \beta(1 - 1/m))$-approximate. A recent survey by Drozdowski [5] details algorithms and heuristics for
the scheduling problem with communication delays.

Our Contributions Our primary contribution is to show that convex programming formulations of many
scheduling problems permit energy constraints to be simply appended; this enables us to produce comparable
provable results to the energy-blind case without substantially more complex analysis. In section 2 we
consider the case where the communication delays are all zero. We obtain optimal schedules when m > n
and for m = 2; and for m < n, we obtain $(2 - 1/m)$-approximate schedules. These results are analogous
to the energy-blind results of [12, 13] and [7][3], and improve on the result in [26].

In section 3 we consider the case where the communication delays may be non-zero. For small communication
delays we obtain $\beta$-approximate schedules for m > n processors, $\beta = \frac{2 + 2\rho}{2\rho + 1}$, and $(1 + \beta(1 - 1/m))$-approximate
schedules for arbitrary m. These results extend the energy-blind results in [14][23][15]. For
large communication delays we extend the approach of [10] to obtain $\frac{2(R+1)}{3}$-approximate schedules. In
these cases, $\rho$ and R are parameters to the algorithm that bound the relative size of the delays.

Discussion The variants for which we obtain approximations are NP-complete [32][28][24] and are therefore
unlikely to have fast exact solutions. We obtain our results modulo $\epsilon$ error due to the finite precision
used in solving convex programs. To simplify our analyses we do not mention this term further.

Except in section 2.3, our algorithms do not use preemption (stopping and starting of tasks), migration
(moving tasks from one processor to another), or recomputation (computing a task more than once). Our
algorithms in section 2 are approximate even to optimal algorithms that do have these properties. Our
algorithms in section 3 are approximate to optimal algorithms that do not have these properties.

We consider cases in which $e_j$ and $d_j$ are inversely related according to a convex function; that is,
$e_j(d_j)$ is convex and non-increasing. It is natural to make this assumption for several reasons. Convexity
is a more general assumption than either the speed scaling model in [26] or the dynamic voltage scaling
model assumed by the experimental work cited above; the time/energy tradeoff in both of these models is 
convex. We consider separate functions for each task because they may vary in their use of resources and 
therefore may not have the same curve, even when a measure of "work" for a task is taken into account. 
Lastly, because we consider a homogeneous system in which there is no contention for resources (other than 
processors) by tasks, the functions are independent of processors and of other tasks. 
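As an illustration of one tradeoff function that fits these assumptions, the sketch below uses the common speed-scaling form in which power is speed raised to a constant alpha (alpha = 3 matches the voltage-cubed relation mentioned earlier). A task with a fixed amount of work w that runs for time d then uses energy w^alpha / d^(alpha - 1). The helper names and the numerical check are assumptions made for this example, not definitions from the paper.

```python
def speed_scaling_energy(work, alpha=3.0):
    """Return e(d) = work**alpha / d**(alpha - 1), assuming power = speed**alpha."""
    def e(d):
        return work ** alpha / d ** (alpha - 1)
    return e

def looks_convex_nonincreasing(e, grid, tol=1e-12):
    """Numerically check monotonicity and midpoint convexity on an evenly spaced grid."""
    vals = [e(d) for d in grid]
    nonincreasing = all(b <= a + tol for a, b in zip(vals, vals[1:]))
    convex = all(2 * b <= a + c + tol for a, b, c in zip(vals, vals[1:], vals[2:]))
    return nonincreasing and convex

# Evenly spaced durations from 0.5 to 5.25
grid = [0.5 + 0.25 * k for k in range(20)]
e_cpu = speed_scaling_energy(work=2.0, alpha=3.0)
print(looks_convex_nonincreasing(e_cpu, grid))
```

Because each task may carry its own function, a task that mostly loads the network or memory could be given a different curve, as long as it remains convex and non-increasing in the duration.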

Notation We use the following notation throughout the paper. i, j, and k are tasks. p and q are processors.
$p_j$ is the processor on which j is scheduled. We write $i \to j$ if the edge $(i, j)$ exists in G, $i <_G j$ if there is
a (directed) path from i to j in G, and $i \sim j$ if neither $i <_G j$ nor $j <_G i$. The duration of a task j is $d_j$ and
the energy it consumes is $e_j$; these are inversely related in a model-dependent manner. $t_j$ is the start time
of task j. $\sigma$ is a schedule, $\mu$ is a makespan and E is the energy bound. We write $t_j(\sigma_1)$, $\mu(\sigma_2)$, etc. to
differentiate between schedule values. Task j is a source task if there is no $i \to j$ and a sink task if there is
no $j \to k$. We say a task j is active if it is currently running on a processor and available if its predecessors
have all completed (regardless of whether it is active). We say a processor is active at time t if it is working on
a task at time t, and idle otherwise.

All of these parameters are required to be non-negative. We abuse notation and use I to refer to both an
interval $I = [a, b]$ and its length $b - a$, and we use S to refer to both the set S and the number of elements it
contains. If S is a set of disjoint intervals, we also write S for $\sum_{I \in S} I$.

Organization We organize the rest of the paper as follows. Section 2 contains our results for the case in
which the communication delays are all zero. In section 3 we extend the results of [14][23][15] to the case in
which there are small communication delays and the result of [10] to the case in which the delays are large.

2 Zero Communication Delays 

In this section we consider a model in which a task j may be started as soon as all $i \to j$ have completed and
there is a free processor. Our algorithms in this section are competitive against preemptive and migratory
algorithms as a result of theorem 2.2. Recomputation never helps in this model; if a schedule $\sigma$ that computes
a task j more than once removes the second computation, the total energy will decrease, and every task can
still begin at $t_j(\sigma)$. We therefore assume that no schedule uses recomputation.

2.1 m > n Processors

We use the following convex program in the m > n case. We assume that $e_j(d_j)$ can be computed in
polynomial time.

Program 2.1.

Minimize $\mu$ subject to:

1. $t_j \ge t_i + d_i$ for all $i \to j$

2. $\mu \ge t_j + d_j$ for all $j$

3. $\sum_j e_j(d_j) \le E$

4. $\mu \ge \frac{1}{m} \sum_j d_j$

$\mu, t_j, d_j \ge 0$

Constraint 4 will be necessary in section 2.2. In the case where m > n, this only constrains the makespan
to be at least the average duration.

We show that this convex program is equivalent to the scheduling problem. 

Theorem 2.2. There is a solution $\mu, t_j, d_j$ to program 2.1 iff there is a feasible schedule $\sigma$ with makespan
$\mu$.

Proof. We prove the theorem by constructing a solution to one out of a solution to the other. The following 
two algorithms perform these conversions. □ 
Algorithm 2.3. Given a solution $\mu, t_j, d_j$ to program 2.1, schedule each task j in $\sigma$ at time $t_j$ with duration
$d_j$. Always schedule j on its own processor.

Precedence and energy constraints are satisfied immediately. j does not cause co-occurrence conflicts
with any other task k because it is on its own processor. $\sigma$ uses n < m processors.

Algorithm 2.4. Given a schedule $\sigma$ on m processors, construct $\mu, t_j, d_j$ as follows. Set $\mu = \mathrm{makespan}(\sigma)$
and $t_j = t_j(\sigma)$. Let $S_j$ be the set of intervals over which j is active in $\sigma$. Set $d_j = S_j$.

Constraint 1 is satisfied because $t_j(\sigma)$ is at least the end of the last interval in $S_i$ for $i \to j$, and the first
interval in $S_i$ does not begin until $t_i(\sigma)$. Constraint 2 is satisfied similarly. Constraint 3 is satisfied because
$\sigma$ is feasible, and constraint 4 is satisfied because $\sigma$ uses at most m processors and the makespan is at least
the average load.

Because program 2.1 is a convex optimization problem with polynomially many constraints, it can be
solved in polynomial time. To generate an optimal schedule $\sigma$ we first solve program 2.1 and then use
algorithm 2.3 to construct the schedule.
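To see what the program is doing, it helps to look at the special case in which the durations are already fixed. The precedence and makespan constraints then reduce to a longest-path computation, and the average-load constraint described in the text contributes the term (sum of durations) / m. The sketch below illustrates only that inner step; it is not a convex solver, and the function name `earliest_schedule` is a hypothetical choice for this example.

```python
from graphlib import TopologicalSorter

def earliest_schedule(durations, edges, m):
    """For fixed durations, return (mu, t): the smallest makespan and earliest start times.

    durations: {task: d_j}; edges: iterable of (i, j) precedence pairs; m: processor count.
    """
    preds = {j: set() for j in durations}
    for i, j in edges:
        preds[j].add(i)
    t = {}
    # Visit predecessors first; each task starts once its last predecessor has finished.
    for j in TopologicalSorter(preds).static_order():
        t[j] = max((t[i] + durations[i] for i in preds[j]), default=0.0)
    critical_path = max(t[j] + durations[j] for j in durations)
    average_load = sum(durations.values()) / m
    return max(critical_path, average_load), t
```

Placing each task j at time t[j] on its own processor, as in algorithm 2.3, then gives a schedule whose length is the critical-path value. A convex solver would additionally vary the durations subject to the energy bound, which this sketch does not attempt.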

2.2 m Processors 

In the case where m is arbitrary we obtain a $(2 - 1/m)$-approximation.

Lemma 2.5. Given a schedule $\sigma_1$ that satisfies program 2.1, algorithm 2.6 constructs a schedule $\sigma_2$ that
has makespan $\mu_2 \le (2 - 1/m)\, \mu_1$ and that uses at most E energy.

Algorithm 2.6. While there are unscheduled tasks in $\sigma_2$, let t be the earliest time for which there is both a
processor p that has no tasks scheduled after time t and a task j whose predecessors $i \to j$ have all completed
by time t. Schedule j for duration $d_j(\sigma_1)$ starting at time t on processor p in $\sigma_2$.

Proof. We cut the time interval $(0, \mu_2)$ at each point at which some task begins or ends, and partition these
sub-intervals into two sets A and B, where A contains all intervals in which all m processors are active in
$\sigma_2$ and B contains the rest. $\mu_2 = A + B$. We define $W = \sum_j d_j$ for convenience.

We bound $B \le \mu_1$ with a potential argument. For each time t in $\sigma_2$ let $F_t$ be the set of tasks finished
by time t and $J_t$ the set of available tasks at time t. We define $\phi(t)$ as the smallest time u in $\sigma_1$ such that
$\sigma_1$ finishes all tasks $F_t$ by time u and has completed at least as much work on all tasks in $J_t$ as $\sigma_2$ has
completed. We cannot have $\phi(t) > \mu_1$ since $\sigma_1$ completes all tasks by time $\mu_1$. We must also have $\phi(0) = 0$
and $\phi(t_1) \le \phi(t_2)$ whenever $t_1 \le t_2$.

For each interval $I = (a, b)$ in B it must be that all $J_a = J_t < m$ available tasks are active; otherwise,
we could have scheduled an available task on an idle processor. We must further have $\phi(b) - \phi(a) \ge b - a$;
otherwise, $\sigma_1$ could complete $I \cdot J_a$ work on these $J_a$ tasks in less than I time. Together, these yield

$$B = \sum_{(a,b) \in B} (b - a) \le \sum_{(a,b) \in B} \big(\phi(b) - \phi(a)\big) \le \sum_{(a,b) \in A \cup B} \big(\phi(b) - \phi(a)\big) = \phi(\mu_2) - \phi(0) \le \mu_1$$

$\sigma_2$ completes at least I work over each interval I in B because in each such I there is at least one active
processor; otherwise, t would be the earliest time at which we could schedule a task in $\sigma_2$, but we chose a
time $t' > t$ instead.






We bound $A \le (W - B)/m$ since there is no more than $(W - B)$ work to do in A and all m processors
are active in A.

By constraint 4 in program 2.1, $\mu_1 \ge W/m$. Together with our bounds for A and B we have

$$\mu_2 = A + B \le W/m + (1 - 1/m)\, B \le (2 - 1/m)\, \mu_1. \qquad \Box$$
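The greedy rule analysed in this proof can be sketched in a few lines. This is an illustrative implementation under simplifying assumptions: durations are fixed inputs (in the paper they come from the convex program), ties are broken by dictionary order, and the name `list_schedule` is hypothetical.

```python
def list_schedule(durations, edges, m):
    """Greedy list scheduling on m processors.

    Returns (start, proc, makespan) where start and proc map each task to its
    start time and processor index.
    """
    preds = {j: [] for j in durations}
    for i, j in edges:
        preds[j].append(i)
    free_at = [0.0] * m              # time at which each processor becomes idle
    start, proc = {}, {}
    while len(start) < len(durations):
        p = min(range(m), key=free_at.__getitem__)   # earliest-idle processor
        best = None
        for j in durations:
            # Skip tasks already placed or with an unplaced predecessor.
            if j in start or any(i not in start for i in preds[j]):
                continue
            ready = max((start[i] + durations[i] for i in preds[j]), default=0.0)
            t = max(ready, free_at[p])
            if best is None or t < best[0]:
                best = (t, j)
        t, j = best                  # earliest time with both an idle processor and a ready task
        start[j], proc[j] = t, p
        free_at[p] = t + durations[j]
    makespan = max(start[j] + durations[j] for j in durations)
    return start, proc, makespan
```

Since the durations are copied unchanged from the input, the energy used is the same as in the schedule they came from; only the makespan can grow, and the lemma bounds that growth.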

Theorem 2.7. We can construct a $(2 - 1/m)$-approximation for the case when m is arbitrary.

Proof. Let $\mu^*$ be the optimal makespan obtainable for m processors on a given task graph G with energy
bound E, using an optimal schedule $\sigma^*$. Program 2.1 has an objective value $\mu_1$ that is no larger than $\mu^*$;
otherwise, we could set $t_j = t_j(\sigma^*)$ and $d_j = d_j(\sigma^*)$, and all constraints would remain satisfied.

Since $\mu_1 \le \mu^*$, we can use algorithm 2.3 to create a schedule $\sigma_1$ that uses n processors, and
algorithm 2.6 to generate a schedule $\sigma_2$ that uses m processors. By the lemmas, we have
$\mu(\sigma_2) \le (2 - 1/m)\, \mu_1 \le (2 - 1/m)\, \mu^*$. □



2.3 Two Processors 

In the case where m = 2 we can obtain an optimal schedule if migration and preemption are permitted. We
combine convex programming with a fractional version of the matching-based algorithm in [7].

We consider tasks j according to any linear ordering $<_L$ that is an extension of $<_G$. The duration $d_j$ of
a task j is broken into components. The time when task j is the only task running is $\ell_j$. The time when task
j is running at the same time as another task $i <_L j$ is $\ell_{ij}$.

Program 2.8.

Minimize $\mu = \sum_j \ell_j + \sum_i \sum_{j >_L i} \ell_{ij}$ subject to:

1. $d_j = \ell_j + \sum_{i <_L j} \ell_{ij} + \sum_{k >_L j} \ell_{jk}$ for all $j$

2. $\ell_{ij} = 0$ for all $i <_G j$

We extend this linear program with the addition of a convex energy constraint. We consider $d_j$ to be a
fixed value in program 2.8 and a variable in program 2.9.

Program 2.9.

Program 2.8 subject to the additional constraint:

3. $\sum_j e_j(d_j) \le E$

We can construct a solution to program 2.9 from a feasible schedule $\sigma$ by setting $\ell_j$ ($\ell_{ij}$) as the sum over
intervals in which only task j (tasks i and j) are active.

To construct a feasible schedule from a solution to the program, we use the following lemma as a 
subroutine. 

Lemma 2.10. Let S be the set of source tasks in G. Given a solution to program 2.8 with objective value
$\mu$, we can construct a new solution with the same objective value that satisfies the condition
$\exists s \in S\ \forall j \notin S\ (\ell_{sj} = 0)$; call this s the next task.
Proof. Let $Z = \{(i, j) \mid i \in S, j \notin S, \ell_{ij} > 0\}$. We say Z has inductive pairs $(i, j)$ and $(h, k)$ if these pairs
exist in Z and $j \sim k$. The proof is by complete induction on the set Z.

If Z is empty, then the lemma holds for all j in S. If Z does not have inductive pairs, then we define
$T = \{j \mid \exists i\ ((i, j) \in Z)\}$. It must be that the tasks in T are fully ordered by $<_G$ (and not just $<_L$);
otherwise, we would have $j \sim k$ for some pairs. Let j be the least task in T. Since j is in T, it is not in S,
and therefore there is some s in S such that $s <_G j$. There can be no $k \in T$ for which $(s, k) \in Z$; otherwise,
$s <_G j <_G k$ would violate constraint 2. Because s does not appear in Z, the lemma holds for s.

For the inductive case, let $(i, j)$ and $(h, k)$ be inductive pairs in Z, so that $j \sim k$. We also have that $i \sim h$
because i and h are both in S. We define $\Delta = \min\{\ell_{ij}, \ell_{hk}\}$, and update $\ell_{ih} := \ell_{ih} + \Delta$, $\ell_{jk} := \ell_{jk} + \Delta$,
$\ell_{ij} := \ell_{ij} - \Delta$, and $\ell_{hk} := \ell_{hk} - \Delta$.

Neither $d_j$ nor $\mu$ changes as a result of these updates. No $\ell$ is updated to be less than zero. Constraint
2 still holds because $j \sim k$ and $i \sim h$. Therefore, this new solution is also feasible and has the same
objective value. Because $\Delta = \min\{\ell_{ij}, \ell_{hk}\}$, either $\ell_{ij} = 0$ or $\ell_{hk} = 0$, so Z decreases and we continue
inductively. □
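The exchange step in the inductive case can be checked numerically. The sketch below assumes four distinct tasks and stores the shared running time of each unordered pair in a dictionary keyed by a frozenset; this representation and the helper names `pair`, `exchange`, `duration`, and `objective` are illustrative assumptions rather than the paper's notation.

```python
def pair(a, b):
    """Unordered key for the time during which tasks a and b run together."""
    return frozenset((a, b))

def exchange(overlap, i, j, h, k):
    """Apply the exchange to inductive pairs (i, j) and (h, k); returns a new dict."""
    delta = min(overlap.get(pair(i, j), 0.0), overlap.get(pair(h, k), 0.0))
    new = dict(overlap)
    updates = ((pair(i, h), +1), (pair(j, k), +1), (pair(i, j), -1), (pair(h, k), -1))
    for key, sign in updates:
        new[key] = new.get(key, 0.0) + sign * delta
    return new

def duration(task, solo, overlap):
    """Total running time of a task: its solo time plus every overlap that involves it."""
    return solo[task] + sum(v for key, v in overlap.items() if task in key)

def objective(solo, overlap):
    """Schedule length: all solo time plus all pairwise overlap time."""
    return sum(solo.values()) + sum(overlap.values())
```

Each task gains and loses the same amount, so every duration and the objective stay fixed while at least one of the two original overlaps drops to zero.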

The following algorithm constructs a feasible schedule $\sigma$ iteratively. Program 2.9 is solved once initially
to determine $d_j$ that satisfy constraint 3. In each iteration, a solution is maintained for program 2.8 using
fixed values chosen so that the total durations are $d_j$.

Algorithm 2.11. Initially, solve program 2.9 for the input graph G to determine durations $d_j$ for each task
j. Define $G_0 = G$, $d^0_j = d_j$, and $\sigma_0$ to be an empty schedule.

Then, for each $l = 0, \ldots, n - 1$, do the following. Let $S_l$ be the set of sources in $G_l$. Solve program 2.8
on $G_l$ using fixed durations $d^l_j$. Run lemma 2.10 as a subroutine to find the next task $s_l \in S_l$ and updated
values $\ell^l_j$, $\ell^l_{ij}$. To get $\sigma_{l+1}$, begin with $\sigma_l$, schedule task s alone for $\ell^l_s$ time, and then for each other $i \in G_l$
schedule tasks s and i together for $\ell^l_{si}$ time. Define $G_{l+1}$ as $G_l - \{s\}$ and $d^{l+1}_i = d^l_i - \ell^l_{si}$.

Our resulting schedule $\sigma$ is $\sigma_n$.

We note that the updated values $\ell^l_j$, $\ell^l_{ij}$ for all tasks except s constitute a feasible optimum for program
2.8 on $G_{l+1}$, so by saving these values we do not need to recompute a solution to this program.

Theorem 2.12. Let $\Pi(G)$ be program 2.8 instantiated for graph G. $\sigma$ is feasible and has makespan $\mu(\sigma)$
equal to the objective value $\mu^\Pi_0$ of $\Pi(G_0)$.

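Proof. Let $\mu_l = \mu(\sigma_l)$ and let $\mu^\Pi_l$ be the objective value of $\Pi(G_l)$. We first prove that
$\mu_{l+1} + \mu^\Pi_{l+1} \le \mu_l + \mu^\Pi_l$ for all $l$. We have $\mu_{l+1} = \mu_l + \ell^l_s + \sum_{i >_L s} \ell^l_{si}$
because we schedule s and the pairs $(s, i)$ to get $\sigma_{l+1}$. We also have that
$\mu^\Pi_{l+1} \le \mu^\Pi_l - \ell^l_s - \sum_{i >_L s} \ell^l_{si}$;
otherwise, we could use $\ell^{l+1}_j = \ell^l_j$, $\ell^{l+1}_{ij} = \ell^l_{ij}$ and get a better solution to $\Pi(G_{l+1})$.

Inductively, we must have $\mu_n + \mu^\Pi_n \le \mu^\Pi_0$ because $\mu_0 = 0$. We further have that $\mu^\Pi_n = 0$ because $G_n$
is empty. Therefore, $\mu(\sigma) \le \mu^\Pi_0$. Because we can construct a feasible solution to $\Pi(G_0)$ from $\sigma$, we also
have $\mu(\sigma) \ge \mu^\Pi_0$.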

We also must prove that $\sigma$ uses at most E energy. The $d_j$ were chosen by program 2.9 to satisfy
constraint 3. Program 2.8 satisfies constraint 1 with equality. Because we update $d^{l+1}_i = d^l_i - \ell^l_{si}$, the total
duration for which each task j runs is $d_j$, and therefore the total energy used is at most E. □

3 Small Communication Delays 

In this section we consider a model in which each edge $i \to j$ in G has an associated communication delay
of $c_{ij}$. When task i finishes, if task j is scheduled on a different processor, it may not begin until at least $c_{ij}$
time has passed. (If j is started on the same processor, it may begin immediately after i finishes, as long as it
is not otherwise constrained.) This models a system in which there is a time/energy trade-off for processors
and a constant-speed interconnect running at s data per time that must transfer $s \cdot c_{ij}$ data from $p_i$ to $p_j$ and
can make any number of point-to-point transfers at a time.

Because we consider small communication delays, we assume a given $\rho \ge 1$ for which $c_{ij} \le d_k / \rho$
for all $i \to j$ and k, and we compare against optimal solutions that also satisfy this constraint. Our results
improve with increasing $\rho$ (and smaller communication delays).

3.1 m > n Processors

We extend the approach of [14][23] to account for energy. We first show that the following non-convex
program is equivalent to the scheduling problem. $x_{ij}$ is an indicator variable that is 1 if j follows i on
processor $p_i$ without waiting the $c_{ij}$ communication delay time, and 0 otherwise.

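Program 3.1.

Minimize $\mu$ subject to:

1. $t_j \ge t_i + d_i + (1 - x_{ij})\, c_{ij}$ for all $i \to j$

2. $\sum_{j : i \to j} x_{ij} \le 1$ for all $i$

3. $\sum_{i : i \to j} x_{ij} \le 1$ for all $j$

4. $\mu \ge t_j + d_j$ for all $j$

5. $x_{ij} \in \{0, 1\}$ for all $i \to j$

6. $\sum_j e_j(d_j) \le E$

7. $c_{ij} \le d_k / \rho$ for all $i \to j$, $k$

8. $\mu \ge \frac{1}{m} \sum_j d_j$

$\mu, t_j, d_j \ge 0$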

Lemma 3.2. There is a solution $\mu, t_j, d_j, x_{ij}$ to program 3.1 iff there is a feasible schedule $\sigma$ with makespan
$\mu$ that satisfies $d_k / \rho \ge c_{ij}$ and that does not use preemption, migration, or recomputation.

Proof. We prove the lemma by constructing a solution to one out of a solution to the other. The following 
two algorithms perform these conversions. □ 

Algorithm 3.3. Given a solution $\mu, t_j, d_j, x_{ij}$ to program 3.1, construct $\sigma$ as follows. Consider tasks j
according to any linear extension $<_L$ of $<_G$. Schedule each task j at time $t_j$ with duration $d_j$. If there is
some $i \to j$ for which $x_{ij} = 1$, schedule j on $p_i$, and on a new processor otherwise.

Precedence, communication, and energy constraints are satisfied immediately. If j is scheduled on a
new processor, then j does not cause co-occurrence conflicts with any prior task $k <_L j$ because it is on
a new processor. If j is scheduled on $p_i$, then it could co-occur with a prior task $k <_L j$ only if k is also
scheduled on $p_i$ after i. But k is not scheduled on $p_i$ after i because $x_{ij} = 1$ and therefore $x_{ik} = x_{kj} = 0$ by
constraints 2 and 3, so in this case j also does not cause co-occurrence conflicts.

Algorithm 3.4. Given a schedule $\sigma$, construct $\mu, t_j, d_j, x_{ij}$ as follows. Set $\mu = \mathrm{makespan}(\sigma)$, $t_j = t_j(\sigma)$,
$d_j = d_j(\sigma)$. Set $x_{ij} = 1$ if $t_j < t_i + d_i + c_{ij}$ and $x_{ij} = 0$ otherwise.

All constraints except for 2, 3, and 8 are satisfied immediately. Suppose $x_{ij} = 1$. Since $t_j < t_i + d_i + c_{ij}$
and $c_{ij} \le d_k$, it must be that no other task k is scheduled between $t_i + d_i$ and $t_j$ on processor $p_i$. Therefore,
for any other task k where $i \to k$ or $k \to j$, $x_{ik} = x_{kj} = 0$, so constraints 2 and 3 hold. Lastly, makespan($\sigma$)
is at least the average load on each processor, and therefore constraint 8 holds.

We note that we need $\rho \ge 1$ to enforce that no task k can be between i and j on $p_i$; if $\rho < 1$ then we
could possibly have $d_k < c_{ij}$, which would break our argument.

We relax program 3.1 to a convex program by requiring only that $0 \le x_{ij} \le 1$ rather than $x_{ij} \in \{0, 1\}$.
We show that the following deterministic rounding algorithm gives a feasible schedule $\sigma$ that satisfies the
energy bound E and is $\beta$-approximate in the makespan for $\beta = \frac{2 + 2\rho}{2\rho + 1}$.

Algorithm 3.5. Solve the convex relaxation of program 3.1. For each $x_{ij}$, define $\bar{x}_{ij} = 1$ if $x_{ij} > 1/2$
and $\bar{x}_{ij} = 0$ otherwise. Take any linear extension $<_L$ of $<_G$. For each j in order of $<_L$, if there is an $i \to j$
such that $\bar{x}_{ij} = 1$, schedule j on $p_i$ to begin at time $t_i(\sigma) + d_i$. If there is no such i, schedule j on its own
processor at time $\max_{i \to j} \{t_i(\sigma) + d_i + c_{ij}\}$. Always schedule j for duration $d_j$.

Proof. We first prove the stronger claim that for every task j, $t_j(\sigma) \le \beta\, t_j$. We prove this inductively on
$<_L$. If j is a source task, then there are no $i \to j$, so $t_j(\sigma) = 0 \le \beta\, t_j$.

If j is not a source task, then j is scheduled either on $p_i$ for some $i \to j$ or on its own processor. If j is
scheduled on $p_i$ then $\bar{x}_{ij} = 1$, and therefore $t_j(\sigma) = t_i(\sigma) + d_i \le \beta\, t_i + d_i \le \beta\, t_j$ by induction and constraint
1. Otherwise, there is some $i \to j$ for which j is scheduled at time $t_j(\sigma) = t_i(\sigma) + d_i + c_{ij} \le \beta\, t_i + d_i + c_{ij}$.
$\bar{x}_{ij} = 0$, so $x_{ij} \le 1/2$, and therefore $1 - x_{ij} \ge 1/2$ and $t_j \ge t_i + d_i + c_{ij}/2$. With $\beta = \frac{2 + 2\rho}{2\rho + 1}$ and $c_{ij} \le d_i / \rho$,
algebraic manipulation yields $t_j(\sigma) \le \beta\, t_j$.
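The algebraic step can be spelled out as follows. This is a reconstruction that assumes the ratio $\beta = (2 + 2\rho)/(2\rho + 1)$, which is the smallest value for which the argument goes through; it uses only the two facts stated in the proof, $t_j \ge t_i + d_i + c_{ij}/2$ and $c_{ij} \le d_i/\rho$.

```latex
% Goal: \beta t_i + d_i + c_{ij} \le \beta (t_i + d_i + c_{ij}/2) \le \beta t_j.
% The first inequality is equivalent to
\[
  \Bigl(1 - \tfrac{\beta}{2}\Bigr) c_{ij} \;\le\; (\beta - 1)\, d_i .
\]
% With \beta = \frac{2 + 2\rho}{2\rho + 1} we have
\[
  \beta - 1 = \frac{1}{2\rho + 1},
  \qquad
  1 - \frac{\beta}{2} = \frac{\rho}{2\rho + 1},
\]
% so, using c_{ij} \le d_i / \rho,
\[
  \Bigl(1 - \tfrac{\beta}{2}\Bigr) c_{ij}
  = \frac{\rho\, c_{ij}}{2\rho + 1}
  \le \frac{d_i}{2\rho + 1}
  = (\beta - 1)\, d_i .
\]
```

For $\rho = 1$ this gives $\beta = 4/3$, and $\beta$ decreases toward 1 as $\rho$ grows, matching the remark that the results improve with smaller communication delays.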

With this stronger claim, we can bound the makespan by

$$\mu(\sigma) = \max_j \{t_j(\sigma) + d_j\} \le \max_j \{\beta\, t_j + d_j\} \le \beta\, \mu$$

We now prove that $\sigma$ is a feasible schedule. Precedence, communication, and energy constraints are satisfied
immediately. If j is scheduled on a new processor, then j does not cause co-occurrence conflicts with any
prior task $k <_L j$ because it is on a new processor. If j is scheduled on $p_i$, then $\bar{x}_{ik} = \bar{x}_{kj} = 0$ for any other
task k, so in this case j also does not cause co-occurrence conflicts. □
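The rounding step can be sketched as below, taking the fractional values as given inputs rather than solving the relaxation. Two points are assumptions of this sketch rather than statements from the paper: the function name `round_and_schedule` is hypothetical, and the start time is taken as the maximum over all predecessors (with the delay waived only for a predecessor on the same processor), which is a conservative variant of the rule stated in the algorithm.

```python
from graphlib import TopologicalSorter

def round_and_schedule(x, d, c, edges):
    """Round fractional x_ij at 1/2 and place tasks in topological order.

    x, c: {(i, j): value} for every edge; d: {task: duration}.
    Returns {task: (start, processor)} with integer processor labels.
    """
    preds = {j: [] for j in d}
    for i, j in edges:
        preds[j].append(i)
    start, proc = {}, {}
    next_proc = 0
    order = TopologicalSorter({j: set(ps) for j, ps in preds.items()}).static_order()
    for j in order:
        # At most one predecessor can exceed 1/2 when the x_ij into j sum to at most 1.
        favoured = [i for i in preds[j] if x[(i, j)] > 0.5]
        if favoured:
            proc[j] = proc[favoured[0]]      # follow that predecessor on its processor
        else:
            proc[j] = next_proc              # otherwise open a new processor
            next_proc += 1
        # Wait for every predecessor; the delay is waived only on a shared processor.
        start[j] = max(
            (start[i] + d[i] + (0.0 if proc[i] == proc[j] else c[(i, j)]) for i in preds[j]),
            default=0.0,
        )
    return {j: (start[j], proc[j]) for j in d}
```

For a task "a" with successors "b" (x = 0.7) and "c" (x = 0.3), all durations 2 and all delays 1, "b" follows "a" immediately on the same processor at time 2, while "c" opens a new processor and starts at time 3.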



3.2 m Processors 

We use the following lemma from [15] as a black box to obtain a bound for m processors.

Lemma 3.6. In the case where $c_{ij} \le d_k / \rho$, given a schedule $\sigma_1$ with makespan $\mu_1$ that uses more than m
processors we can construct a new schedule $\sigma_2$ with makespan $\mu_2 \le \frac{1}{m} \sum_j d_j + (1 - 1/m)\, \mu_1$ that uses
only m processors.

The model considered in [15] assumes fixed durations, so $d_j(\sigma_1) = d_j(\sigma_2)$ and therefore $\sigma_2$ uses at
most E energy as well.

Theorem 3.7. We can construct a $(1 + \beta(1 - 1/m))$-approximate schedule $\sigma$ for the case in which m < n.

Proof. We use algorithm 3.5 to generate a schedule $\sigma_1$ with makespan $\mu_1 \le \beta\, \mu$, where $\mu$ is the optimal
objective value of program 3.1.

Let $\sigma^*$ be an optimal schedule for m processors with makespan $\mu^*$. We must have $\mu^* \ge \mu$; otherwise,
we could use algorithm 3.4 to convert $\sigma^*$ into a feasible solution to program 3.1 with smaller $\mu$.

$\sigma^*$ satisfies constraint 8 of program 3.1 and therefore $\mu^* \ge \frac{1}{m} \sum_j d_j$. These three bounds plus the bound
in lemma 3.6 yield the theorem. □
3.3 Large Communication Delays

We use the approach of [10] to obtain a $\frac{2(R+1)}{3}$-approximate schedule $\sigma_2$ in the case of large communication
delays where $c_{ij} / d_k \le R$.

Algorithm 3.8. Define $\bar{c}_{ij} = c_{ij} / R$. Run algorithm 3.5 on G, $\bar{c}_{ij}$, and $\rho = 1$ to get a schedule $\sigma_1$. Scale up
each communication delay $\bar{c}_{ij}$ in $\sigma_1$ by R to get $\sigma_2$.

Theorem 3.9. $\sigma_2$ is a $\frac{2(R+1)}{3}$-approximate schedule on m > n processors.

Proof. Let $\bar{\mu}^*$ be the optimal makespan for $\bar{c}_{ij}$ and $\mu^*$ the optimal makespan for $c_{ij}$. It is clear that $\bar{\mu}^* \le \mu^*$;
otherwise, when we scale down by R we could get a better solution.

Let $\mu_1 = \mu(\sigma_1)$ and $\mu_2 = \mu(\sigma_2)$. We have $\mu_1 \le \frac{4}{3} \bar{\mu}^*$ by algorithm 3.5, since $\beta = 4/3$ when $\rho = 1$. $\mu_1$ is the length of some critical
path P of tasks in $\sigma_1$; P has g tasks and g - 1 communication delays (some of which may be 0).

Let A be the contribution of task durations to $\mu_1$ and B be the contribution of delays. For each task k in
P and each delay $\bar{c}_{ij}$ in P, we have $\bar{c}_{ij} \le d_k$. Therefore, $A \ge \mu_1 / 2$. When we scale the $\bar{c}_{ij}$ up by R to get
$c_{ij}$ in $\sigma_2$, we get $\mu_2 \le A + R\,B \le \frac{1}{2} \mu_1 + \frac{R}{2} \mu_1 \le \frac{2(R+1)}{3} \bar{\mu}^*$, so $\mu_2 \le \frac{2(R+1)}{3} \mu^*$. □

4 Conclusions 

We have shown that for several scheduling problems, using convex programming, we can obtain approximation
bounds when energy constraints are present that are no worse than the existing bounds obtained
when energy is not considered. Our analyses for the most part used the analyses of algorithms for the
corresponding energy-blind cases, adjusted where needed to fit the convex programming formulation. 

One possible direction for future work is to characterize the conditions necessary for the approach in this 
paper to be applicable. We rely heavily on the fact that, once the time/energy allocations are determined, 
our problem is essentially an instance of the energy-blind problem, and can be solved using energy-blind 
methods. 

As an example, recomputation is a consideration for which our approach appears to break. In scheduling 
models that permit multiple copies of tasks to be computed on different processors, such as the model 
considered by Papadimitriou and Yannakakis in [25], the energy for each copy should be taken into account.
More work is necessary to determine the extent to which our methods are applicable to these models. In 
particular, a convex programming formulation that would permit energy constraints to be appended is not 
obvious and would be interesting. 